Claims 1-12 and 14-23 are pending in the application.
Claims 1-12 and 14-23 have been rejected.

Reconsideration and allowance of the Claims is respectfully requested.

I. REJECTIONS UNDER 35 U.S.C. § 103

Claims 1, 4-6 and 9-12 were rejected under 35 U.S.C. § 103(a) as being unpatentable over
Sharman (US 5,744,854) in view of Hata, et al. (US 5,878,393) and "New Riverside University
Dictionary" ("DIC").

Claims 2-3 and 21-23 were rejected under 35 U.S.C. § 103(a) as being unpatentable over
Sharman (US 5,744,854) in view of Hata, et al. (US 5,878,393) and DIC, and further in view of Oh
(US 6,141,642).

Claim 7 was rejected under 35 U.S.C. § 103(a) as being unpatentable over Sharman (US
5,744,854) in view of Hata, et al. (US 5,878,393) and DIC, and further in view of Microsoft Press,
"Computer Dictionary", page 298 ("Rl").

Claim 8 was rejected under 35 U.S.C. § 103(a) as being unpatentable over Sharman (US
5,744,854) in view of Hata, et al. (US 5,878,393) and DIC, and Rl, and further in view of O'Donnell
("Programming For The World - A Guide To Internationalization", ISBN 0-13-722190-8).

Claims 14-15 and 19-20 were rejected under 35 U.S.C. § 103(a) as being unpatentable over
Sharman (US 5,744,854) in view of Hata, et al. (US 5,878,393) and DIC, and further in view of
Malsheen, et al. (US 4,979,216).

Claims 16 and 18 were rejected under 35 U.S.C. § 103(a) as being unpatentable over
Sharman (US 5,744,854) in view of Hata, et al. (US 5,878,393) and DIC and Rl, and further in view
of Oh (US 6,141,642).

Claim 17 was rejected under 35 U.S.C. § 103(a) as being unpatentable over Sharman (US
5,744,854) in view of Hata, et al. (US 5,878,393) and DIC, and Rl, and O'Donnell ("Programming
For The World - A Guide To Internationalization", ISBN 0-13-722190-8), and further in view of Oh
(US 6,141,642).

The rejections are respectfully traversed. 

Applicant again respectfully notes that to establish a prima facie case of obviousness, three
basic criteria must be met. First, there must be some suggestion or motivation, either in the
references themselves or in the knowledge generally available to one of ordinary skill in the art, to
modify the reference or to combine reference teachings. Second, there must be a reasonable
expectation of success. Finally, the prior art reference (or references when combined) must teach
or suggest all the claim limitations. Importantly, the teaching or suggestion to make the claimed
invention and the reasonable expectation of success must both be found in the prior art, and not
based on applicant's disclosure. MPEP § 2142.

Applicant respectfully submits, as detailed hereafter, that the Examiner's suggested
combinations of references are improper, as they rely on selective hindsight reconstruction guided
by Applicant's Claims and disclosure - not on the prospective teachings and suggestions of the cited
references.

a. Claim 1. 

Applicant respectfully traverses the Examiner's suggested interpretations of the cited 
references. 
Applicant respectfully submits that Sharman discloses a text to speech system for converting
input text into an output acoustic signal simulating natural speech. (Col. 1, lines 65-67). Sharman
discloses breaking down words into constituent syllables. A dictionary may be used for the purpose
of determining syllabic breaks. (Col. 5, lines 22-25). Constituent syllables are then further broken
down into constituent phonemes. A dictionary look-up table - or some general purpose rules - may
be used for this purpose. (Col. 5, lines 22-25). Sharman discloses removal and disregarding of any
possible prefix or suffix, so as to enable the disaggregation of an underlying word into phonemes.
(Col. 5, lines 26-29).

Sharman discloses annotation of phonemes with certain characteristics (e.g., pitch, duration).
(Col. 5, lines 22-25). Steps are then performed to assemble phonemes into "breath groups." (Col.
5, line 48 - Col. 6, line 16). After the constituent phonemes and their characteristics have been
determined, the acoustic processor determines diphones from the constituent phonemes. Each
diphone represents a transition between two phonemes. A diphone library (with prerecorded sounds
of the diphones) is accessed to retrieve the corresponding diphone samples, which are then
concatenated to produce the output signal. See, generally, Col. 5, line 18-40; Col. 6, lines 22-38.

Therefore, Sharman parses the text file down to constituent phonemes. A group of phonemes
are matched to diphones from a diphone library. Sound units are then grouped and generated using
diphones. Sharman suggests that its approach is desirable to avoid excess processing. (Col. 2, lines
22-33).

The entire Sharman reference teaches away from processing whole words. 

Thus far, the Examiner has conceded that: "Sharman fails to explicitly disclose utilizing
'speech sample' for the speech in the diphone library for the phonetic data"; "Sharman does not
expressly disclose 'each speech sample corresponding to a one of said words, ... in said
vocabulary'"; "Sharman does not expressly disclose 'each speech sample corresponding to a one of
said, ... prefixes and suffixes in said vocabulary'"; "Sharman does not expressly disclose 'wherein
said vocabulary of textural [sic] unit comprises words ... each having a pre-recorded speech sample
associated therewith'"; and "Sharman does not expressly disclose 'wherein said vocabulary of
textural [sic] unit comprises ... prefixes and suffixes each having a pre-recorded speech sample
associated therewith.'"

Applicant agrees with these concessions. 

By comparison, Hata discloses a high-quality concatenative reading system for converting
an input string into a sequence for subsequent audible synthesis. (Col. 1, lines 64-66). Hata
discloses a dictionary of words and a word list generator coupled to the dictionary. The word list
generator receives the input string and builds a word list from words stored in the dictionary
corresponding to the input string. The word list generator assigns one or more prosodic environment
tokens to word list entries - preferably to each entry in the word list. A reading system analyzes the
word list to determine phonological features on the entries. Based on the word list, the environment
token(s), and the phonological features, the reading system selects speech samples to be concatenated
to supply the signal for audible synthesis. (Col. 1, line 66 - Col. 2, line 41).

Hata discloses a dictionary of digitally sampled sounds that have been recorded and stored
in advance. (Col. 3, lines 42-44). The dictionary includes different samples for each possible pitch
contour (i.e., prosodic environment) for each word in the dictionary. (Col. 4, lines 28-31). In
addition to storing an entry for each prosodic environment of each word, the dictionary may also
store all pronunciation variants of each word for each prosodic environment. (Col. 4, lines 37-55).

Hata's dictionary thus comprises multiple variations of each word entry. 

Hata discloses input of text to be converted to speech. (Col. 4, lines 58-63). Hata discloses
a word list generator including a prosodic environment table that identifies the possible prosodic
states of the text to speech implementation. (Col. 4, lines 63-65). The word list generator builds a
word list comprising a word token, and a prosodic token for each word token, arranged in the order
that they will be pronounced in the output speech. (Col. 5, lines 10-15). A reading system includes
a phonological feature analyzer. For each word in the word list, the phonological feature analyzer
evaluates the preceding and following words. Based on this information, the phonological feature
analyzer selects a pre-recorded sample from the library having the corresponding prosodic and
phonological characteristics. This is then added to a sample list, which is eventually output as the
speech signal. (Col. 5, lines 16-31).

Regarding Claim 1, the Examiner postulates that "it would have been obvious to one of
ordinary skill in the art at the time the invention was made to combine Sharman and Hata to provide
a stored speech sample in a word or other larger units (the removed prefix or suffix may be good
candidate units, since they must associate some pronunciation unit for outputting, anyway)."

Applicant respectfully disagrees.

Again, the entire Sharman reference teaches away from processing whole words.

In order to combine the Sharman and Hata references as the Examiner has suggested -
without the benefit of hindsight - one of ordinary skill in the art would have to: 1) find and read the
Sharman reference; 2) understand Sharman's phoneme and diphone-based structures and operations;
3) spontaneously decide - in spite of Sharman's teaching away from processing whole words - that,
out of all of Sharman's system, only its elemental and basic stored diphones needed to be replaced
with stored speech samples in word or larger units; 4) seek out and find Hata's system; 5) disregard
all of Hata's structures and methods, except for its library of pre-recorded samples for a given word
and each tonal variation thereof; 6) selectively cull from Hata only its extensive library of
pre-recorded samples for a given word and each tonal variation thereof; 7) disregard Sharman's
expressed intent to avoid excess processing; 8) replace Sharman's diphone library with Hata's
extensive library of pre-recorded samples for a given word and each tonal variation thereof; 9)
discard Sharman's extensive teaching of, and structure and operations for, breaking words down into
their constituent syllables; 10) discard Sharman's structure and operations for phoneme duration
assignment; 11) discard Sharman's structure and operations for breath group assembly; and 12)
modify Sharman's remaining diphone-based structures and operations to successfully process and
output Hata's pre-recorded word samples.

Applicant respectfully submits that there is no teaching or motivation in either Sharman or 
Hata to suggest such selective and substantial modification to one of ordinary skill in the art, or to
provide a reasonable expectation of successfully completing such selective and substantial 
modification. 

The Examiner then goes on to speculate that "it would have been obvious to one of ordinary
skill in the art at the time the invention was made to modify Sharman in view of Hata for specifically
providing a mechanism to treat a prefix or suffix as same way [sic] as a word in a dictionary, as
taught by DIC, so that each stored speech sample can correspond to one of word, prefix and suffix,
for the purpose of selecting the appropriated [sic] 'granularity' or dictionary entry size to suit the
specific application."

Applicant respectfully disagrees. 

In order to further modify the already substantially modified Sharman/Hata combination as
the Examiner has suggested - without the benefit of hindsight - one of ordinary skill in the art would
have to: 1) spontaneously decide that - to the extent that either Sharman or Hata disclose or suggest
treatment of prefixes or suffixes - both references were somehow deficient or incomplete with regard
thereto; 2) seek out and find the DIC reference; 3) arbitrarily decide, based upon a dictionary entry
for either a prefix or suffix, that prefixes and suffixes should be treated as words; and 4) further
modify the structures and operations of the already substantially modified Sharman/Hata
combination to handle prefixes and suffixes as words.

Again, Applicant respectfully submits that there is no teaching or motivation in the cited
references to suggest such highly selective and highly speculative modification to one of ordinary
skill in the art, or to provide a reasonable expectation of successfully completing such modification.

Furthermore, such selective and substantial modifications still fail to teach or suggest all the
limitations of Claim 1.

Applicant respectfully submits that Claim 1 overcomes the rejection based upon a highly
speculative and selective combination of the Sharman and Hata references. Claim 1 is allowable.

Applicant respectfully requests reconsideration and allowance of Claim 1.

b. Claims 4 and 5

Claims 4 and 5 depend from allowable Claim 1, and provide limitations not taught
or suggested by either Sharman or Hata.

Claim 4, and Claim 5 depending therefrom, requires creation of a speech unit by splicing
together a plurality of speech samples prior to appending the speech unit to the output signal.

Claims 4 and 5 are thus allowable. Applicant respectfully requests reconsideration and
allowance of Claims 4 and 5.

c. Claim 6

Applicant respectfully traverses the Examiner's suggested interpretations of the cited
references.



Applicant again respectfully submits that Sharman discloses a text to speech system for
converting input text into an output acoustic signal simulating natural speech. (Col. 1, lines 65-67).
Sharman discloses breaking down words into constituent syllables. A dictionary may be used for
the purpose of determining syllabic breaks. (Col. 5, lines 22-25). Constituent syllables are then
further broken down into constituent phonemes. A dictionary look-up table - or some general
purpose rules - may be used for this purpose. (Col. 5, lines 22-25). Sharman discloses removal and
disregarding of any possible prefix or suffix, so as to enable the disaggregation of an underlying
word into phonemes. (Col. 5, lines 26-29).

Sharman discloses annotation of phonemes with certain characteristics (e.g., pitch, duration).
(Col. 5, lines 22-25). Steps are then performed to assemble phonemes into "breath groups." (Col.
5, line 48 - Col. 6, line 16). After the constituent phonemes and their characteristics have been
determined, the acoustic processor determines diphones from the constituent phonemes. Each
diphone represents a transition between two phonemes. A diphone library (with prerecorded sounds
of the diphones) is accessed to retrieve the corresponding diphone samples, which are then
concatenated to produce the output signal. See, generally, Col. 5, line 18-40; Col. 6, lines 22-38.

Therefore, Sharman parses the text file down to constituent phonemes. A group of phonemes
are matched to diphones from a diphone library. Sound units are then grouped and generated using
diphones. Sharman suggests that its approach is desirable to avoid excess processing. (Col. 2, lines
22-33).

The entire Sharman reference teaches away from processing whole words.

Thus far, the Examiner has conceded that: "Sharman fails to explicitly disclose utilizing
'speech sample' for the speech in the diphone library for the phonetic data"; "Sharman does not
expressly disclose 'each speech sample corresponding to a one of said words, ... in said
vocabulary'"; "Sharman does not expressly disclose 'each speech sample corresponding to a one of
said, ... prefixes and suffixes in said vocabulary'"; "Sharman does not expressly disclose 'wherein
said vocabulary of textural [sic] unit comprises words ... each having a pre-recorded speech sample
associated therewith'"; and "Sharman does not expressly disclose 'wherein said vocabulary of
textural [sic] unit comprises ... prefixes and suffixes each having a pre-recorded speech sample
associated therewith.'"

Applicant agrees with these concessions.

By comparison, Hata discloses a high-quality concatenative reading system for converting
an input string into a sequence for subsequent audible synthesis. (Col. 1, lines 64-66). Hata
discloses a dictionary of words and a word list generator coupled to the dictionary. The word list
generator receives the input string and builds a word list from words stored in the dictionary
corresponding to the input string. The word list generator assigns one or more prosodic environment
tokens to word list entries - preferably to each entry in the word list. A reading system analyzes the
word list to determine phonological features on the entries. Based on the word list, the environment
token(s), and the phonological features, the reading system selects speech samples to be concatenated
to supply the signal for audible synthesis. (Col. 1, line 66 - Col. 2, line 41).

Hata discloses a dictionary of digitally sampled sounds that have been recorded and stored
in advance. (Col. 3, lines 42-44). The dictionary includes different samples for each possible pitch
contour (i.e., prosodic environment) for each word in the dictionary. (Col. 4, lines 28-31). In
addition to storing an entry for each prosodic environment of each word, the dictionary may also
store all pronunciation variants of each word for each prosodic environment. (Col. 4, lines 37-55).

Hata's dictionary thus comprises multiple variations of each word entry.

Hata discloses input of text to be converted to speech. (Col. 4, lines 58-63). Hata discloses
a word list generator including a prosodic environment table that identifies the possible prosodic
states of the text to speech implementation. (Col. 4, lines 63-65). The word list generator builds a
word list comprising a word token, and a prosodic token for each word token, arranged in the order
that they will be pronounced in the output speech. (Col. 5, lines 10-15). A reading system includes
a phonological feature analyzer. For each word in the word list, the phonological feature analyzer
evaluates the preceding and following words. Based on this information, the phonological feature
analyzer selects a pre-recorded sample from the library having the corresponding prosodic and
phonological characteristics. This is then added to a sample list, which is eventually output as the
speech signal. (Col. 5, lines 16-31).

Regarding Claim 6, the Examiner postulates that "it would have been obvious to one of
ordinary skill in the art at the time the invention was made to modify Sharman for specifically
providing a stored speech sample corresponding to a word or other unit, as taught by Hata, for the
purpose of selecting the appropriated [sic] 'granularity' or dictionary entry size to suit the specific
application."

Applicant respectfully disagrees.

Again, the entire Sharman reference teaches away from processing whole words.

In order to combine the Sharman and Hata references as the Examiner has suggested -
without the benefit of hindsight - one of ordinary skill in the art would have to: 1) find and read the
Sharman reference; 2) understand Sharman's phoneme and diphone-based structures and operations;
3) spontaneously decide - in spite of Sharman's teaching away from processing whole words - that,
out of all of Sharman's system, only its elemental and basic stored diphones needed to be improved
to include larger stored speech samples in word or larger units; 4) seek out and find Hata's system;
5) disregard all of Hata's structures and methods, except for its library of pre-recorded samples for
a given word and each tonal variation thereof; 6) selectively cull from Hata only its extensive library
of pre-recorded samples for a given word and each tonal variation thereof; 7) disregard Sharman's
expressed intent to avoid excess processing; 8) augment Sharman's diphone library with Hata's
extensive library of pre-recorded samples for a given word and each tonal variation thereof; 9)
selectively discard Sharman's extensive teaching of, and structure and operations for, breaking words
down into their constituent syllables; 10) selectively discard Sharman's structure and operations for
phoneme duration assignment; 11) selectively discard Sharman's structure and operations for breath
group assembly; and 12) substantially modify and supplement Sharman's syllabic, phoneme and
diphone-based structures and operations to successfully process and output Hata's pre-recorded word
samples.

Applicant respectfully submits that there is no teaching or motivation in either Sharman or
Hata to suggest such selective and substantial modification to one of ordinary skill in the art, or to
provide a reasonable expectation of successfully completing such selective and substantial
modification.

The Examiner then goes on to speculate that "it would have been obvious to one of ordinary
skill in the art at the time the invention was made to modify Sharman in view of Hata for specifically
providing a mechanism to treat a prefix or suffix as same way [sic] as a word in a dictionary, as
taught by DIC, so that each stored speech sample can correspond to one of word, prefix and suffix,
for the purpose of selecting the appropriated [sic] 'granularity' or dictionary entry size to suit the
specific application."

Applicant respectfully disagrees. 

In order to further modify the already substantially modified Sharman/Hata combination as
the Examiner has suggested - without the benefit of hindsight - one of ordinary skill in the art would
have to: 1) spontaneously decide that - to the extent that either Sharman or Hata disclose or suggest
treatment of prefixes or suffixes - both references were somehow deficient or incomplete with regard
thereto; 2) seek out and find the DIC reference; 3) arbitrarily decide, based upon a dictionary entry
for either a prefix or suffix, that prefixes and suffixes should be treated as words; and 4) further
modify the structures and operations of the already substantially modified Sharman/Hata
combination to handle prefixes and suffixes as words.

Again, Applicant respectfully submits that there is no teaching or motivation in the cited
references to suggest such highly selective and highly speculative modification to one of ordinary
skill in the art, or to provide a reasonable expectation of successfully completing such modification.

Furthermore, such selective and substantial modifications still fail to teach or suggest all the 

limitations of Claim 6. 

Applicant respectfully submits that Claim 6 overcomes the rejection based upon a highly
speculative and selective combination of the Sharman, Hata and DIC references. Claim 6 is allowable.
Applicant respectfully requests reconsideration and allowance of Claim 6.

d. Claim 9

Applicant respectfully traverses the Examiner's suggested interpretations of the cited
references.

Applicant again respectfully submits that Sharman discloses a text to speech system for
converting input text into an output acoustic signal simulating natural speech. (Col. 1, lines 65-67).
Sharman discloses breaking down words into constituent syllables. A dictionary may be used for
the purpose of determining syllabic breaks. (Col. 5, lines 22-25). Constituent syllables are then
further broken down into constituent phonemes. A dictionary look-up table - or some general
purpose rules - may be used for this purpose. (Col. 5, lines 22-25). Sharman discloses removal and
disregarding of any possible prefix or suffix, so as to enable the disaggregation of an underlying
word into phonemes. (Col. 5, lines 26-29).

Sharman discloses annotation of phonemes with certain characteristics (e.g., pitch, duration).
(Col. 5, lines 22-25). Steps are then performed to assemble phonemes into "breath groups." (Col.
5, line 48 - Col. 6, line 16). After the constituent phonemes and their characteristics have been
determined, the acoustic processor determines diphones from the constituent phonemes. Each
diphone represents a transition between two phonemes. A diphone library (with prerecorded sounds
of the diphones) is accessed to retrieve the corresponding diphone samples, which are then
concatenated to produce the output signal. See, generally, Col. 5, line 18-40; Col. 6, lines 22-38.

Therefore, Sharman parses the text file down to constituent phonemes. A group of phonemes
are matched to diphones from a diphone library. Sound units are then grouped and generated using
diphones. Sharman suggests that its approach is desirable to avoid excess processing. (Col. 2, lines
22-33).

The entire Sharman reference teaches away from processing whole words.

Thus far, the Examiner has conceded that: "Sharman fails to explicitly disclose utilizing
'speech sample' for the speech in the diphone library for the phonetic data"; "Sharman does not
expressly disclose 'each speech sample corresponding to a one of said words, ... in said
vocabulary'"; "Sharman does not expressly disclose 'each speech sample corresponding to a one of
said, ... prefixes and suffixes in said vocabulary'"; "Sharman does not expressly disclose 'wherein
said vocabulary of textural [sic] unit comprises words ... each having a pre-recorded speech sample
associated therewith'"; and "Sharman does not expressly disclose 'wherein said vocabulary of
textural [sic] unit comprises ... prefixes and suffixes each having a pre-recorded speech sample
associated therewith.'"

Applicant agrees with these concessions.

By comparison, Hata discloses a high-quality concatenative reading system for converting
an input string into a sequence for subsequent audible synthesis. (Col. 1, lines 64-66). Hata
discloses a dictionary of words and a word list generator coupled to the dictionary. The word list
generator receives the input string and builds a word list from words stored in the dictionary
corresponding to the input string. The word list generator assigns one or more prosodic environment
tokens to word list entries - preferably to each entry in the word list. A reading system analyzes the
word list to determine phonological features on the entries. Based on the word list, the environment
token(s), and the phonological features, the reading system selects speech samples to be concatenated
to supply the signal for audible synthesis. (Col. 1, line 66 - Col. 2, line 41).

Hata discloses a dictionary of digitally sampled sounds that have been recorded and stored
in advance. (Col. 3, lines 42-44). The dictionary includes different samples for each possible pitch
contour (i.e., prosodic environment) for each word in the dictionary. (Col. 4, lines 28-31). In
addition to storing an entry for each prosodic environment of each word, the dictionary may also
store all pronunciation variants of each word for each prosodic environment. (Col. 4, lines 37-55).

Hata's dictionary thus comprises multiple variations of each word entry.

Hata discloses input of text to be converted to speech. (Col. 4, lines 58-63). Hata discloses
a word list generator including a prosodic environment table that identifies the possible prosodic
states of the text to speech implementation. (Col. 4, lines 63-65). The word list generator builds a
word list comprising a word token, and a prosodic token for each word token, arranged in the order
that they will be pronounced in the output speech. (Col. 5, lines 10-15). A reading system includes
a phonological feature analyzer. For each word in the word list, the phonological feature analyzer
evaluates the preceding and following words. Based on this information, the phonological feature
analyzer selects a pre-recorded sample from the library having the corresponding prosodic and
phonological characteristics. This is then added to a sample list, which is eventually output as the
speech signal. (Col. 5, lines 16-31).

Regarding Claim 9, the Examiner indicates that the rejection of the claim is based on the
same reason as described for Claim 1.

Regarding Claim 1, the Examiner postulates that "it would have been obvious to one of
ordinary skill in the art at the time the invention was made to combine Sharman and Hata to provide
a stored speech sample in a word or other larger units (the removed prefix or suffix may be good
candidate units, since they must associate some pronunciation unit for outputting, anyway)."

Applicant respectfully disagrees.

Again, the entire Sharman reference teaches away from processing whole words.

In order to combine the Sharman and Hata references as the Examiner has suggested -
without the benefit of hindsight - one of ordinary skill in the art would have to: 1) find and read the
Sharman reference; 2) understand Sharman's phoneme and diphone-based structures and operations;
3) spontaneously decide - in spite of Sharman's teaching away from processing whole words - that,
out of all of Sharman's system, only its elemental and basic stored diphones needed to be replaced
with stored speech samples in word or larger units; 4) seek out and find Hata's system; 5) disregard
all of Hata's structures and methods, except for its library of pre-recorded samples for a given word
and each tonal variation thereof; 6) selectively cull from Hata only its extensive library of
pre-recorded samples for a given word and each tonal variation thereof; 7) disregard Sharman's
expressed intent to avoid excess processing; 8) replace Sharman's diphone library with Hata's
extensive library of pre-recorded samples for a given word and each tonal variation thereof; 9)
discard Sharman's extensive teaching of, and structure and operations for, breaking words down into
their constituent syllables; 10) discard Sharman's structure and operations for phoneme duration
assignment; 11) discard Sharman's structure and operations for breath group assembly; and 12)
modify Sharman's remaining diphone-based structures and operations to successfully process and
output Hata's pre-recorded word samples.

Applicant respectfully submits that there is no teaching or motivation in either Sharman or
Hata to suggest such selective and substantial modification to one of ordinary skill in the art, or to
provide a reasonable expectation of successfully completing such selective and substantial
modification.

Furthermore, such selective and substantial modification still fails to teach or suggest all the
limitations of Claim 1 and, subsequently, Claim 9.

Applicant respectfully submits that Claim 9 overcomes the rejection based upon a highly
speculative and selective combination of the Sharman and Hata references. Claim 9 is allowable.
Applicant respectfully requests reconsideration and allowance of Claim 9.

e. Claim 10

Applicant respectfully traverses the Examiner's suggested interpretations of the cited
references.

Applicant again respectfully submits that Sharman discloses a text to speech system for
converting input text into an output acoustic signal simulating natural speech. (Col. 1, lines 65-67).
Sharman discloses breaking down words into constituent syllables. A dictionary may be used for
the purpose of determining syllabic breaks. (Col. 5, lines 22-25). Constituent syllables are then
further broken down into constituent phonemes. A dictionary look-up table - or some general
purpose rules - may be used for this purpose. (Col. 5, lines 22-25). Sharman discloses removal and
disregarding of any possible prefix or suffix, so as to enable the disaggregation of an underlying
word into phonemes. (Col. 5, lines 26-29).

Sharman discloses annotation of phonemes with certain characteristics (e.g., pitch, duration).
(Col. 5, lines 22-25). Steps are then performed to assemble phonemes into "breath groups." (Col.
5, line 48 - Col. 6, line 16). After the constituent phonemes and their characteristics have been
determined, the acoustic processor determines diphones from the constituent phonemes. Each
diphone represents a transition between two phonemes. A diphone library (with prerecorded sounds
of the diphones) is accessed to retrieve the corresponding diphone samples, which are then
concatenated to produce the output signal. See, generally, Col. 5, line 18-40; Col. 6, lines 22-38.

Therefore, Sharman parses the text file down to constituent phonemes. A group of phonemes
are matched to diphones from a diphone library. Sound units are then grouped and generated using
diphones. Sharman suggests that its approach is desirable to avoid excess processing. (Col. 2, lines
22-33).

The entire Sharman reference teaches away from processing whole words.

Thus far, the Examiner has conceded that: "Sharman fails to explicitly disclose utilizing
'speech sample' for the speech in the diphone library for the phonetic data"; "Sharman does not
expressly disclose 'each speech sample corresponding to a one of said words, ... in said
vocabulary'"; "Sharman does not expressly disclose 'each speech sample corresponding to a one of
said, ... prefixes and suffixes in said vocabulary'"; "Sharman does not expressly disclose 'wherein
said vocabulary of textural [sic] unit comprises words ... each having a pre-recorded speech sample
associated therewith'"; and "Sharman does not expressly disclose 'wherein said vocabulary of
textural [sic] unit comprises ... prefixes and suffixes each having a pre-recorded speech sample
associated therewith.'"

Applicant agrees with these concessions.

By comparison, Hata discloses a high-quality concatenative reading system for converting
an input string into a sequence for subsequent audible synthesis. (Col. 1, lines 64-66). Hata
discloses a dictionary of words and a word list generator coupled to the dictionary. The word list
generator receives the input string and builds a word list from words stored in the dictionary
corresponding to the input string. The word list generator assigns one or more prosodic environment
tokens to word list entries - preferably to each entry in the word list. A reading system analyzes the
word list to determine phonological features on the entries. Based on the word list, the environment
token(s), and the phonological features, the reading system selects speech samples to be concatenated
to supply the signal for audible synthesis. (Col. 1, line 66 - Col. 2, line 41).

Hata discloses a dictionary of digitally sampled sounds that have been recorded and stored
in advance. (Col. 3, lines 42-44). The dictionary includes different samples for each possible pitch
contour (i.e., prosodic environment) for each word in the dictionary. (Col. 4, lines 28-31). In
addition to storing an entry for each prosodic environment of each word, the dictionary may also
store all pronunciation variants of each word for each prosodic environment. (Col. 4, lines 37-55).
Hata's dictionary thus comprises multiple variations of each word entry.

Hata discloses input of text to be converted to speech. (Col. 4, lines 58-63). Hata discloses
a word list generator including a prosodic environment table that identifies the possible prosodic
states of the text to speech implementation. (Col. 4, lines 63-65). The word list generator builds a
word list comprising a word token, and a prosodic token for each word token, arranged in the order
that they will be pronounced in the output speech. (Col. 5, lines 10-15). A reading system includes
a phonological feature analyzer. For each word in the word list, the phonological feature analyzer
evaluates the preceding and following words. Based on this information, the phonological feature
analyzer selects a pre-recorded sample from the library having the corresponding prosodic and
phonological characteristics. This is then added to a sample list, which is eventually output as the
speech signal. (Col. 5, lines 16-31).

Regarding Claim 10, the Examiner indicates that the rejection of the claim is based on the
same reason as described for Claim 1.

Regarding Claim 1, the Examiner postulates that "it would have been obvious to one of
ordinary skill in the art at the time the invention was made to combine Sharman and Hata to provide
a stored speech sample in a word or other larger units (the removed prefix or suffix may be good
candidate units, since they must associate some pronunciation unit for outputting, anyway)."

Applicant respectfully disagrees. 

Again, the entire Sharman reference teaches away from processing whole words.

In order to combine the Sharman and Hata references as the Examiner has suggested -
without the benefit of hindsight - one of ordinary skill in the art would have to: 1) find and read the
Sharman reference; 2) understand Sharman's phoneme and diphone-based structures and operations;
3) spontaneously decide - in spite of Sharman's teaching away from processing whole words - that,
out of all of Sharman's system, only its elemental and basic stored diphones needed to be replaced
with stored speech samples in word or larger units; 4) seek out and find Hata's system; 5) disregard
all of Hata's structures and methods, except for its library of pre-recorded samples for a given word
and each tonal variation thereof; 6) selectively cull from Hata only its extensive library of
pre-recorded samples for a given word and each tonal variation thereof; 7) disregard Sharman's
expressed intent to avoid excess processing; 8) replace Sharman's diphone library with Hata's
extensive library of pre-recorded samples for a given word and each tonal variation thereof; 9)
discard Sharman's extensive teaching of, and structure and operations for, breaking words down into
their constituent syllables; 10) discard Sharman's structure and operations for phoneme duration
assignment; 11) discard Sharman's structure and operations for breath group assembly; and 12)
modify Sharman's remaining diphone-based structures and operations to successfully process and
output Hata's pre-recorded word samples.

Applicant respectfully submits that there is no teaching or motivation in either Sharman or 
Hata to suggest such selective and substantial modification to one of ordinary skill in the art, or to
provide a reasonable expectation of successfully completing such selective and substantial 
modification. 

Furthermore, such selective and substantial modification still fails to teach or suggest all the
limitations of Claim 1 and, subsequently, Claim 10.

Applicant respectfully submits that Claim 10 overcomes the rejection based upon a highly
speculative and selective combination of the Sharman and Hata references. Claim 10 is allowable.
Applicant respectfully requests reconsideration and allowance of Claim 10.
f. Claim 11 

Applicant respectfully traverses the Examiner's suggested interpretations of the cited
references. 

Applicant again respectfully submits that Sharman discloses a text to speech system for
converting input text into an output acoustic signal simulating natural speech. (Col. 1, lines 65-67).
Sharman discloses breaking down words into constituent syllables. A dictionary may be used for
the purpose of determining syllabic breaks. (Col. 5, lines 22-25). Constituent syllables are then
further broken down into constituent phonemes. A dictionary look-up table - or some general
purpose rules - may be used for this purpose. (Col. 5, lines 22-25). Sharman discloses removal and
disregarding of any possible prefix or suffix, so as to enable the disaggregation of an underlying
word into phonemes. (Col. 5, lines 26-29).
Sharman discloses annotation of phonemes with certain characteristics (e.g., pitch, duration).
(Col. 5, lines 22-25). Steps are then performed to assemble phonemes into "breath groups." (Col.
5, line 48 - Col. 6, line 16). After the constituent phonemes and their characteristics have been
determined, the acoustic processor determines diphones from the constituent phonemes. Each
diphone represents a transition between two phonemes. A diphone library (with prerecorded sounds
of the diphones) is accessed to retrieve the corresponding diphone samples, which are then
concatenated to produce the output signal. See, generally, Col. 5, lines 18-40; Col. 6, lines 22-38.

Therefore, Sharman parses the text file down to constituent phonemes. A group of phonemes
are matched to diphones from a diphone library. Sound units are then grouped and generated using
diphones. Sharman suggests that its approach is desirable to avoid excess processing. (Col. 2, lines
22-33).

The entire Sharman reference teaches away from processing whole words. 

Thus far, the Examiner has conceded that: "Sharman fails to explicitly disclose utilizing
'speech sample' for the speech in the diphone library for the phonetic data"; "Sharman does not
expressly disclose 'each speech sample corresponding to a one of said words, ... in said
vocabulary'"; "Sharman does not expressly disclose 'each speech sample corresponding to a one of
said, ... prefixes and suffixes in said vocabulary'"; "Sharman does not expressly disclose 'wherein
said vocabulary of textural [sic] unit comprises words ... each having a pre-recorded speech sample
associated therewith'"; and "Sharman does not expressly disclose 'wherein said vocabulary of
textural [sic] unit comprises ... prefixes and suffixes each having a pre-recorded speech sample
associated therewith.'"

Applicant agrees with these concessions. 
By comparison, Hata discloses a high-quality concatenative reading system for converting
an input string into a sequence for subsequent audible synthesis. (Col. 1, lines 64-66). Hata
discloses a dictionary of words and a word list generator coupled to the dictionary. The word list
generator receives the input string and builds a word list from words stored in the dictionary
corresponding to the input string. The word list generator assigns one or more prosodic environment
tokens to word list entries - preferably to each entry in the word list. A reading system analyzes the
word list to determine phonological features on the entries. Based on the word list, the environment
token(s), and the phonological features, the reading system selects speech samples to be concatenated
to supply the signal for audible synthesis. (Col. 1, line 66 - Col. 2, line 41).

Hata discloses a dictionary of digitally sampled sounds that have been recorded and stored
in advance. (Col. 3, lines 42-44). The dictionary includes different samples for each possible pitch
contour (i.e., prosodic environment) for each word in the dictionary. (Col. 4, lines 28-31). In
addition to storing an entry for each prosodic environment of each word, the dictionary may also
store all pronunciation variants of each word for each prosodic environment. (Col. 4, lines 37-55).
Hata's dictionary thus comprises multiple variations of each word entry.

Hata discloses input of text to be converted to speech. (Col. 4, lines 58-63). Hata discloses
a word list generator including a prosodic environment table that identifies the possible prosodic
states of the text to speech implementation. (Col. 4, lines 63-65). The word list generator builds a
word list comprising a word token, and a prosodic token for each word token, arranged in the order
that they will be pronounced in the output speech. (Col. 5, lines 10-15). A reading system includes
a phonological feature analyzer. For each word in the word list, the phonological feature analyzer
evaluates the preceding and following words. Based on this information, the phonological feature
analyzer selects a pre-recorded sample from the library having the corresponding prosodic and
phonological characteristics. This is then added to a sample list, which is eventually output as the
speech signal. (Col. 5, lines 16-31).

Regarding Claim 11, the Examiner indicates that the rejection of the claim is based on the
same reason as described for Claim 1.

Regarding Claim 1, the Examiner postulates that "it would have been obvious to one of
ordinary skill in the art at the time the invention was made to combine Sharman and Hata to provide
a stored speech sample in a word or other larger units (the removed prefix or suffix may be good
candidate units, since they must associate some pronunciation unit for outputting, anyway)."

Applicant respectfully disagrees.

Again, the entire Sharman reference teaches away from processing whole words.

In order to combine the Sharman and Hata references as the Examiner has suggested -
without the benefit of hindsight - one of ordinary skill in the art would have to: 1) find and read the
Sharman reference; 2) understand Sharman's phoneme and diphone-based structures and operations;
3) spontaneously decide - in spite of Sharman's teaching away from processing whole words - that,
out of all of Sharman's system, only its elemental and basic stored diphones needed to be replaced
with stored speech samples in word or larger units; 4) seek out and find Hata's system; 5) disregard
all of Hata's structures and methods, except for its library of pre-recorded samples for a given word
and each tonal variation thereof; 6) selectively cull from Hata only its extensive library of
pre-recorded samples for a given word and each tonal variation thereof; 7) disregard Sharman's
expressed intent to avoid excess processing; 8) replace Sharman's diphone library with Hata's
extensive library of pre-recorded samples for a given word and each tonal variation thereof; 9)
discard Sharman's extensive teaching of, and structure and operations for, breaking words down into
their constituent syllables; 10) discard Sharman's structure and operations for phoneme duration
assignment; 11) discard Sharman's structure and operations for breath group assembly; and 12)
modify Sharman's remaining diphone-based structures and operations to successfully process and
output Hata's pre-recorded word samples.

Applicant respectfully submits that there is no teaching or motivation in either Sharman or
Hata to suggest such selective and substantial modification to one of ordinary skill in the art, or to
provide a reasonable expectation of successfully completing such selective and substantial 
modification. 

Furthermore, such selective and substantial modification still fails to teach or suggest all the
limitations of Claim 1 and, subsequently, Claim 11.

Applicant respectfully submits that Claim 11 overcomes the rejection based upon a highly
speculative and selective combination of the Sharman and Hata references. Claim 11 is allowable.
Applicant respectfully requests reconsideration and allowance of Claim 11. 

Claim 12 

Applicant respectfully traverses the Examiner's suggested interpretations of the cited
references. 

Applicant again respectfully submits that Sharman discloses a text to speech system for
converting input text into an output acoustic signal simulating natural speech. (Col. 1, lines 65-67).
Sharman discloses breaking down words into constituent syllables. A dictionary may be used for
the purpose of determining syllabic breaks. (Col. 5, lines 22-25). Constituent syllables are then
further broken down into constituent phonemes. A dictionary look-up table - or some general
purpose rules - may be used for this purpose. (Col. 5, lines 22-25). Sharman discloses removal and
disregarding of any possible prefix or suffix, so as to enable the disaggregation of an underlying
word into phonemes. (Col. 5, lines 26-29).
Sharman discloses annotation of phonemes with certain characteristics (e.g., pitch, duration).
(Col. 5, lines 22-25). Steps are then performed to assemble phonemes into "breath groups." (Col.
5, line 48 - Col. 6, line 16). After the constituent phonemes and their characteristics have been
determined, the acoustic processor determines diphones from the constituent phonemes. Each
diphone represents a transition between two phonemes. A diphone library (with prerecorded sounds
of the diphones) is accessed to retrieve the corresponding diphone samples, which are then
concatenated to produce the output signal. See, generally, Col. 5, lines 18-40; Col. 6, lines 22-38.

Therefore, Sharman parses the text file down to constituent phonemes. A group of phonemes
are matched to diphones from a diphone library. Sound units are then grouped and generated using
diphones. Sharman suggests that its approach is desirable to avoid excess processing. (Col. 2, lines
22-33). 

The entire Sharman reference teaches away from processing whole words. 

Thus far, the Examiner has conceded that: "Sharman fails to explicitly disclose utilizing
'speech sample' for the speech in the diphone library for the phonetic data"; "Sharman does not
expressly disclose 'each speech sample corresponding to a one of said words, ... in said
vocabulary'"; "Sharman does not expressly disclose 'each speech sample corresponding to a one of
said, ... prefixes and suffixes in said vocabulary'"; "Sharman does not expressly disclose 'wherein
said vocabulary of textural [sic] unit comprises words ... each having a pre-recorded speech sample
associated therewith'"; and "Sharman does not expressly disclose 'wherein said vocabulary of
textural [sic] unit comprises ... prefixes and suffixes each having a pre-recorded speech sample
associated therewith.'"

Applicant agrees with these concessions. 
By comparison, Hata discloses a high-quality concatenative reading system for converting
an input string into a sequence for subsequent audible synthesis. (Col. 1, lines 64-66). Hata
discloses a dictionary of words and a word list generator coupled to the dictionary. The word list
generator receives the input string and builds a word list from words stored in the dictionary
corresponding to the input string. The word list generator assigns one or more prosodic environment
tokens to word list entries - preferably to each entry in the word list. A reading system analyzes the
word list to determine phonological features on the entries. Based on the word list, the environment
token(s), and the phonological features, the reading system selects speech samples to be concatenated
to supply the signal for audible synthesis. (Col. 1, line 66 - Col. 2, line 41).

Hata discloses a dictionary of digitally sampled sounds that have been recorded and stored 
in advance. (Col. 3, lines 42-44). The dictionary includes different samples for each possible pitch
contour (i.e., prosodic environment) for each word in the dictionary. (Col. 4, lines 28-31). In
addition to storing an entry for each prosodic environment of each word, the dictionary may also
store all pronunciation variants of each word for each prosodic environment. (Col. 4, lines 37-55).
Hata's dictionary thus comprises multiple variations of each word entry. 

Hata discloses input of text to be converted to speech. (Col. 4, lines 58-63). Hata discloses
a word list generator including a prosodic environment table that identifies the possible prosodic
states of the text to speech implementation. (Col. 4, lines 63-65). The word list generator builds a
word list comprising a word token, and a prosodic token for each word token, arranged in the order
that they will be pronounced in the output speech. (Col. 5, lines 10-15). A reading system includes
a phonological feature analyzer. For each word in the word list, the phonological feature analyzer
evaluates the preceding and following words. Based on this information, the phonological feature
analyzer selects a pre-recorded sample from the library having the corresponding prosodic and
phonological characteristics. This is then added to a sample list, which is eventually output as the
speech signal. (Col. 5, lines 16-31).

Regarding Claim 12, the Examiner indicates that the rejection of the claim is based on the
same reason as described for Claims 1 and 6.

Regarding Claims 1 and 6, the Examiner postulates that "it would have been obvious to one
of ordinary skill in the art at the time the invention was made to combine Sharman and Hata to
provide a stored speech sample in a word or other larger units (the removed prefix or suffix may be
good candidate units, since they must associate some pronunciation unit for outputting, anyway)."

Applicant respectfully disagrees.

Again, the entire Sharman reference teaches away from processing whole words. 

In order to combine the Sharman and Hata references as the Examiner has suggested -
without the benefit of hindsight - one of ordinary skill in the art would have to: 1) find and read the
Sharman reference; 2) understand Sharman's phoneme and diphone-based structures and operations;
3) spontaneously decide - in spite of Sharman's teaching away from processing whole words - that,
out of all of Sharman's system, only its elemental and basic stored diphones needed to be improved
to include larger stored speech samples in word or larger units; 4) seek out and find Hata's system;
5) disregard all of Hata's structures and methods, except for its library of pre-recorded samples for
a given word and each tonal variation thereof; 6) selectively cull from Hata only its extensive library
of pre-recorded samples for a given word and each tonal variation thereof; 7) disregard Sharman's
expressed intent to avoid excess processing; 8) augment Sharman's diphone library with Hata's
extensive library of pre-recorded samples for a given word and each tonal variation thereof; 9)
selectively discard Sharman's extensive teaching of, and structure and operations for, breaking words
down into their constituent syllables; 10) selectively discard Sharman's structure and operations for
phoneme duration assignment; 11) selectively discard Sharman's structure and operations for breath
group assembly; and 12) substantially modify and supplement Sharman's syllabic, phoneme and
diphone-based structures and operations to successfully process and output Hata's pre-recorded word
samples.

Applicant respectfully submits that there is no teaching or motivation in either Sharman or
Hata to suggest such selective and substantial modification to one of ordinary skill in the art, or to 
provide a reasonable expectation of successfully completing such selective and substantial 
modification. 

The Examiner then goes on to speculate that "it would have been obvious to one of ordinary 
skill in the art at the time the invention was made to modify Sharman in view of Hata for specifically 
providing a mechanism to treat a prefix or suffix as same way [sic] as a word in a dictionary, as 
taught by DIC, so that each stored speech sample can correspond to one of word, prefix and suffix,
for the purpose of selecting the appropriated [sic] 'granularity' or dictionary entry size to suit the
specific application."

Applicant respectfully disagrees. 

In order to further modify the already substantially modified Sharman/Hata combination as
the Examiner has suggested - without the benefit of hindsight - one of ordinary skill in the art would
have to: 1) spontaneously decide that - to the extent that either Sharman or Hata disclose or suggest
treatment of prefixes or suffixes - both references were somehow deficient or incomplete with regard
thereto; 2) seek out and find the DIC reference; 3) arbitrarily decide, based upon a dictionary entry
for either a prefix or suffix, that prefixes and suffixes should be treated as words; and 4) further
modify the structures and operations of the already substantially modified Sharman/Hata
combination to handle prefixes and suffixes as words.
Again, Applicant respectfully submits that there is no teaching or motivation in the cited
references to suggest such highly selective and highly speculative modification to one of ordinary
skill in the art, or to provide a reasonable expectation of successfully completing such modification.

Furthermore, such selective and substantial modifications still fail to teach or suggest all the
limitations of Claims 1 and 6 and, subsequently, Claim 12.

Applicant respectfully submits that Claim 12 overcomes the rejection based upon a highly
speculative and selective combination of the Sharman, Hata and DIC references. Claim 12 is
allowable. Applicant respectfully requests reconsideration and allowance of Claim 12.
h. Claims 2 and 3

Claims 2 and 3 depend from allowable Claim 1, and provide further limitations not taught
or suggested by either Sharman or Hata.

Claim 2, and Claim 3 depending therefrom, requires passing of an indicated textual unit to
a secondary text to speech engine, and receiving a speech sample converted from the indicated
textual unit from the secondary text to speech engine.

The Examiner has conceded that "Sharman in view of Hata and DIC does not expressly 
disclose 'passing said indicated textual unit to a secondary text to speech engine; receiving a speech 
sample converted from said indicated textual unit from said secondary text to speech engine'." 

Applicant agrees. 

The Examiner nonetheless goes on to suggest that one of ordinary skill in the art would be
taught or motivated by a combination that does not disclose a secondary text to speech engine to: 1)
seek out a fourth reference, the Oh reference; 2) disregard Oh's extensive teaching of text to speech
processing on a character-by-character basis; 3) selectively cull from Oh only a secondary text to
speech processor; and 4) further modify the structure and operation of the already substantially
modified Sharman/Hata/DIC combination to integrate the secondary text to speech processor from
Oh.

Applicant respectfully submits that there is no teaching or motivation in the cited references
to suggest such highly selective and highly speculative modification to one of ordinary skill in the 
art, or to provide a reasonable expectation of successfully completing such modification. 

Furthermore, such selective and substantial modifications still fail to teach or suggest all the 

limitations of Claims 2 and 3. 

Claims 2 and 3 are thus allowable. Applicant respectfully requests reconsideration and
allowance of Claims 2 and 3.

Applicant respectfully traverses the Examiner's suggested interpretations of the cited
references. 

Applicant again respectfully submits that Sharman discloses a text to speech system for 
converting input text into an output acoustic signal simulating natural speech. (Col. 1, lines 65-67).
Sharman discloses breaking down words into constituent syllables. A dictionary may be used for
the purpose of determining syllabic breaks. (Col. 5, lines 22-25). Constituent syllables are then
further broken down into constituent phonemes. A dictionary look-up table - or some general
purpose rules - may be used for this purpose. (Col. 5, lines 22-25). Sharman discloses removal and
disregarding of any possible prefix or suffix, so as to enable the disaggregation of an underlying 
word into phonemes. (Col. 5, lines 26-29). 

Sharman discloses annotation of phonemes with certain characteristics (e.g., pitch, duration).
(Col. 5, lines 22-25). Steps are then performed to assemble phonemes into "breath groups." (Col.
5, line 48 - Col. 6, line 16). After the constituent phonemes and their characteristics have been
determined, the acoustic processor determines diphones from the constituent phonemes. Each
diphone represents a transition between two phonemes. A diphone library (with prerecorded sounds
of the diphones) is accessed to retrieve the corresponding diphone samples, which are then
concatenated to produce the output signal. See, generally, Col. 5, lines 18-40; Col. 6, lines 22-38.

Therefore, Sharman parses the text file down to constituent phonemes. A group of phonemes
are matched to diphones from a diphone library. Sound units are then grouped and generated using
diphones. Sharman suggests that its approach is desirable to avoid excess processing. (Col. 2, lines
22-33).

The entire Sharman reference teaches away from processing whole words.

Thus far, the Examiner has conceded that: "Sharman fails to explicitly disclose utilizing
'speech sample' for the speech in the diphone library for the phonetic data"; "Sharman does not
expressly disclose 'each speech sample corresponding to a one of said words, ... in said
vocabulary'"; "Sharman does not expressly disclose 'each speech sample corresponding to a one of
said, ... prefixes and suffixes in said vocabulary'"; "Sharman does not expressly disclose 'wherein
said vocabulary of textural [sic] unit comprises words ... each having a pre-recorded speech sample
associated therewith'"; and "Sharman does not expressly disclose 'wherein said vocabulary of
textural [sic] unit comprises ... prefixes and suffixes each having a pre-recorded speech sample
associated therewith.'"

Applicant agrees with these concessions. 

By comparison, Hata discloses a high-quality concatenative reading system for converting 
an input string into a sequence for subsequent audible synthesis. (Col. 1, lines 64-66). Hata
discloses a dictionary of words and a word list generator coupled to the dictionary. The word list
generator receives the input string and builds a word list from words stored in the dictionary
corresponding to the input string. The word list generator assigns one or more prosodic environment
tokens to word list entries - preferably to each entry in the word list. A reading system analyzes the
word list to determine phonological features on the entries. Based on the word list, the environment
token(s), and the phonological features, the reading system selects speech samples to be concatenated
to supply the signal for audible synthesis. (Col. 1, line 66 - Col. 2, line 41).

Hata discloses a dictionary of digitally sampled sounds that have been recorded and stored 
in advance. (Col. 3, lines 42-44). The dictionary includes different samples for each possible pitch
contour (i.e., prosodic environment) for each word in the dictionary. (Col. 4, lines 28-31). In
addition to storing an entry for each prosodic environment of each word, the dictionary may also 
store all pronunciation variants of each word for each prosodic environment. (Col. 4, lines 37-55). 
Hata's dictionary thus comprises multiple variations of each word entry. 

Hata discloses input of text to be converted to speech. (Col. 4, lines 58-63). Hata discloses
a word list generator including a prosodic environment table that identifies the possible prosodic
states of the text to speech implementation. (Col. 4, lines 63-65). The word list generator builds a
word list comprising a word token, and a prosodic token for each word token, arranged in the order
that they will be pronounced in the output speech. (Col. 5, lines 10-15). A reading system includes
a phonological feature analyzer. For each word in the word list, the phonological feature analyzer
evaluates the preceding and following words. Based on this information, the phonological feature
analyzer selects a pre-recorded sample from the library having the corresponding prosodic and
phonological characteristics. This is then added to a sample list, which is eventually output as the 
speech signal. (Col. 5, lines 16-31). 

Regarding Claim 21, the Examiner indicates that the rejection of the claim is based on the
same reason as described for Claims 1, 2 and 6.
Regarding Claims 1 and 6, the Examiner postulates that "it would have been obvious to one
of ordinary skill in the art at the time the invention was made to combine Sharman and Hata to
provide a stored speech sample in a word or other larger units (the removed prefix or suffix may be
good candidate units, since they must associate some pronunciation unit for outputting, anyway)."

Applicant respectfully disagrees. 

Again, the entire Sharman reference teaches away from processing whole words.

In order to combine the Sharman and Hata references as the Examiner has suggested -
without the benefit of hindsight - one of ordinary skill in the art would have to: 1) find and read the
Sharman reference; 2) understand Sharman's phoneme and diphone-based structures and operations;
3) spontaneously decide - in spite of Sharman's teaching away from processing whole words - that,
out of all of Sharman's system, only its elemental and basic stored diphones needed to be improved
to include larger stored speech samples in word or larger units; 4) seek out and find Hata's system;
5) disregard all of Hata's structures and methods, except for its library of pre-recorded samples for
a given word and each tonal variation thereof; 6) selectively cull from Hata only its extensive library
of pre-recorded samples for a given word and each tonal variation thereof; 7) disregard Sharman's
expressed intent to avoid excess processing; 8) augment Sharman's diphone library with Hata's
extensive library of pre-recorded samples for a given word and each tonal variation thereof; 9)
selectively discard Sharman's extensive teaching of, and structure and operations for, breaking words
down into their constituent syllables; 10) selectively discard Sharman's structure and operations for
phoneme duration assignment; 11) selectively discard Sharman's structure and operations for breath
group assembly; and 12) substantially modify and supplement Sharman's syllabic, phoneme and
diphone-based structures and operations to successfully process and output Hata's pre-recorded word
samples.
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Applicant tespectfiilly submits that there is no teaching or motivation in either Shannan or 
Hata to suggest such selective and substantial modification to one of ordinary skill in the art, or to 
provide a reasonable expectation of successfully completing such selective and substantial 
modification. 

The Examiner then goes on to speculate that "it would have been obvious to one of ordinary
skill in the art at the time the invention was made to modify Sharman in view of Hata for specifically
providing a mechanism to treat a prefix or suffix as same way [sic] as a word in a dictionary, as
taught by DIC, so that each stored speech sample can correspond to one of word, prefix and suffix,
for the purpose of selecting the appropriated [sic] 'granularity' or dictionary entry size to suit the
specific application."

Applicant respectfully disagrees. 

In order to further modify the already substantially modified Sharman/Hata combination as
the Examiner has suggested - without the benefit of hindsight - one of ordinary skill in the art would
have to: 1) spontaneously decide that - to the extent that either Sharman or Hata disclose or suggest
treatment of prefixes or suffixes - both references were somehow deficient or incomplete with regard
thereto; 2) seek out and find the DIC reference; 3) arbitrarily decide, based upon a dictionary entry
for either a prefix or suffix, that prefixes and suffixes should be treated as words; and 4) further
modify the structures and operations of the already substantially modified Sharman/Hata
combination to handle prefixes and suffixes as words.

Again, Applicant respectfully submits that there is no teaching or motivation in the cited
references to suggest such highly selective and highly speculative modification to one of ordinary
skill in the art, or to provide a reasonable expectation of successfully completing such modification.

Claim 2 depends from allowable Claim 1, and provides further limitations not taught or
suggested by either Sharman or Hata.

Claim 2 requires passing of an indicated textual unit to a secondary text to speech engine, and
receiving a speech sample converted from the indicated textual unit from the secondary text to
speech engine.

The Examiner has conceded that "Sharman in view of Hata and DIC does not expressly
disclose 'passing said indicated textual unit to a secondary text to speech engine; receiving a speech
sample converted from said indicated textual unit from said secondary text to speech engine'."

Applicant agrees. 

The Examiner nonetheless goes on to suggest that one of ordinary skill in the art would be 
taught or motivated by a combination that does not disclose a secondary text to speech engine to: 1) 
seek out a fourth reference, the Oh reference; 2) disregard Oh's extensive teaching of text to speech
processing on a character-by-character basis; 3) selectively cull from Oh only a secondary text to
speech processor; and 4) further modify the structure and operation of the already substantially
modified Sharman/Hata/DIC combination to integrate the secondary text to speech processor from
Oh.

Applicant respectfully submits that there is no teaching or motivation in the cited references 
to suggest such highly selective and highly speculative modification to one of ordinary skill in the 
art, or to provide a reasonable expectation of successfully completing such modification. 

Furthermore, such selective and substantial modifications still fail to teach or suggest all the
limitations of Claims 1, 2, 6 and, subsequently, 21.

Applicant respectfully submits that Claim 21 overcomes the rejection based upon a highly
speculative and selective combination of the Sharman, Hata, DIC and Oh references. Claim 21 is
allowable. Applicant respectfully requests reconsideration and allowance of Claim 21.

Claim 22

Claim 22 depends from allowable Claim 21, and provides further limitations distinguishing 
over the cited references. 

Claim 22 is thus allowable. Applicant respectfully requests reconsideration and allowance 
of Claim 22. 

Claim 23

Applicant respectfully traverses the Examiner's suggested interpretations of the cited 
references. 

Applicant again respectfully submits that Sharman discloses a text to speech system for 
converting input text into an output acoustic signal simulating natural speech. (Col. 1, lines 65-67). 
Sharman discloses breaking down words into constituent syllables. A dictionary may be used for
the purpose of determining syllabic breaks. (Col. 5, lines 22-25). Constituent syllables are then
further broken down into constituent phonemes. A dictionary look-up table - or some general
purpose rules - may be used for this purpose. (Col. 5, lines 22-25). Sharman discloses removal and
disregarding of any possible prefix or suffix, so as to enable the disaggregation of an underlying
word into phonemes. (Col. 5, lines 26-29). 

Sharman discloses annotation of phonemes with certain characteristics (e.g., pitch, duration).
(Col. 5, lines 22-25). Steps are then performed to assemble phonemes into "breath groups." (Col.
5, line 48 - Col. 6, line 16). After the constituent phonemes and their characteristics have been
determined, the acoustic processor determines diphones from the constituent phonemes. Each
diphone represents a transition between two phonemes. A diphone library (with prerecorded sounds
of the diphones) is accessed to retrieve the corresponding diphone samples, which are then
concatenated to produce the output signal. See, generally, Col. 5, lines 18-40; Col. 6, lines 22-38.

Therefore, Sharman parses the text file down to constituent phonemes. A group of phonemes
are matched to diphones from a diphone library. Sound units are then grouped and generated using
diphones. Sharman suggests that its approach is desirable to avoid excess processing. (Col. 2, lines 
22-33). 

The entire Sharman reference teaches away from processing whole words. 

Thus far, the Examiner has conceded that: "Sharman fails to explicitly disclose utilizing
'speech sample' for the speech in the diphone library for the phonetic data"; "Sharman does not
expressly disclose 'each speech sample corresponding to a one of said words, . . . in said
vocabulary'"; "Sharman does not expressly disclose 'each speech sample corresponding to a one of
said, . . . prefixes and suffixes in said vocabulary'"; "Sharman does not expressly disclose 'wherein
said vocabulary of textural [sic] unit comprises words . . . each having a pre-recorded speech sample
associated therewith'"; and "Sharman does not expressly disclose 'wherein said vocabulary of
textural [sic] unit comprises . . . prefixes and suffixes each having a pre-recorded speech sample
associated therewith.'"

Applicant agrees with these concessions.

By comparison, Hata discloses a high-quality concatenative reading system for converting
an input string into a sequence for subsequent audible synthesis. (Col. 1, lines 64-66). Hata
discloses a dictionary of words and a word list generator coupled to the dictionary. The word list
generator receives the input string and builds a word list from words stored in the dictionary
corresponding to the input string. The word list generator assigns one or more prosodic environment
tokens to word list entries - preferably to each entry in the word list. A reading system analyzes the
word list to determine phonological features on the entries. Based on the word list, the environment
token(s), and the phonological features, the reading system selects speech samples to be concatenated
to supply the signal for audible synthesis. (Col. 1, line 66 - Col. 2, line 41).

Hata discloses a dictionary of digitally sampled sounds that have been recorded and stored
in advance. (Col. 3, lines 42-44). The dictionary includes different samples for each possible pitch
contour (i.e., prosodic environment) for each word in the dictionary. (Col. 4, lines 28-31). In
addition to storing an entry for each prosodic environment of each word, the dictionary may also 
store all pronunciation variants of each word for each prosodic environment. (Col. 4, lines 37-55). 
Hata's dictionary thus comprises multiple variations of each word entry. 

Hata discloses input of text to be converted to speech. (Col. 4, lines 58-63). Hata discloses 
a word list generator including a prosodic environment table that identifies the possible prosodic
states of the text to speech implementation. (Col. 4, lines 63-65). The word list generator builds a
word list comprising a word token, and a prosodic token for each word token, arranged in the order
that they will be pronounced in the output speech. (Col. 5, lines 10-15). A reading system includes
a phonological feature analyzer. For each word in the word list, the phonological feature analyzer
evaluates the preceding and following words. Based on this information, the phonological feature 
analyzer selects a pre-recorded sample from the library having the corresponding prosodic and 
phonological characteristics. This is then added to a sample list, which is eventually output as the 
speech signal. (Col. 5, lines 16-31). 

Regarding Claim 23, the Examiner indicates that the rejection of the claim is based on the
same reason as described for Claims 1, 2 and 6.

Regarding Claims 1 and 6, the Examiner postulates that "it would have been obvious to one
of ordinary skill in the art at the time the invention was made to combine Sharman and Hata to
provide a stored speech sample in a word or other larger units (the removed prefix or suffix may be
good candidate units, since they must associate some pronunciation unit for outputting, anyway)."

Applicant respectfully disagrees. 

Again, the entire Sharman reference teaches away from processing whole words.

In order to combine the Sharman and Hata references as the Examiner has suggested -
without the benefit of hindsight - one of ordinary skill in the art would have to: 1) find and read the
Sharman reference; 2) understand Sharman's phoneme and diphone-based structures and operations;
3) spontaneously decide - in spite of Sharman's teaching away from processing whole words - that,
out of all of Sharman's system, only its elemental and basic stored diphones needed to be improved
to include larger stored speech samples in word or larger units; 4) seek out and find Hata's system;
5) disregard all of Hata's structures and methods, except for its library of pre-recorded samples for
a given word and each tonal variation thereof; 6) selectively cull from Hata only its extensive library
of pre-recorded samples for a given word and each tonal variation thereof; 7) disregard Sharman's
expressed intent to avoid excess processing; 8) augment Sharman's diphone library with Hata's
extensive library of pre-recorded samples for a given word and each tonal variation thereof; 9)
selectively discard Sharman's extensive teaching of, and structure and operations for, breaking words
down into their constituent syllables; 10) selectively discard Sharman's structure and operations for
phoneme duration assignment; 11) selectively discard Sharman's structure and operations for breath
group assembly; and 12) substantially modify and supplement Sharman's syllabic, phoneme and
diphone-based structures and operations to successfully process and output Hata's pre-recorded word
samples.

Applicant respectfully submits that there is no teaching or motivation in either Sharman or
Hata to suggest such selective and substantial modification to one of ordinary skill in the art, or to
provide a reasonable expectation of successfully completing such selective and substantial
modification.

The Examiner then goes on to speculate that "it would have been obvious to one of ordinary
skill in the art at the time the invention was made to modify Sharman in view of Hata for specifically
providing a mechanism to treat a prefix or suffix as same way [sic] as a word in a dictionary, as
taught by DIC, so that each stored speech sample can correspond to one of word, prefix and suffix,
for the purpose of selecting the appropriated [sic] 'granularity' or dictionary entry size to suit the
specific application."

Applicant respectfully disagrees. 

In order to further modify the already substantially modified Sharman/Hata combination as
the Examiner has suggested - without the benefit of hindsight - one of ordinary skill in the art would
have to: 1) spontaneously decide that - to the extent that either Sharman or Hata disclose or suggest
treatment of prefixes or suffixes - both references were somehow deficient or incomplete with regard
thereto; 2) seek out and find the DIC reference; 3) arbitrarily decide, based upon a dictionary entry
for either a prefix or suffix, that prefixes and suffixes should be treated as words; and 4) further
modify the structures and operations of the already substantially modified Sharman/Hata
combination to handle prefixes and suffixes as words.

Again, Applicant respectfully submits that there is no teaching or motivation in the cited
references to suggest such highly selective and highly speculative modification to one of ordinary
skill in the art, or to provide a reasonable expectation of successfully completing such modification.

Claim 2 depends from allowable Claim 1, and provides further limitations not taught or
suggested by either Sharman or Hata.

Claim 2 requires passing of an indicated textual unit to a secondary text to speech engine, and 
receiving a speech sample converted from the indicated textual unit from the secondary text to 
speech engine. 

The Examiner has conceded that "Sharman in view of Hata and DIC does not expressly
disclose 'passing said indicated textual unit to a secondary text to speech engine; receiving a speech 
sample converted from said indicated textual unit from said secondary text to speech engine'." 

Applicant agrees. 

The Examiner nonetheless goes on to suggest that one of ordinary skill in the art would be 
taught or motivated by a combination that does not disclose a secondary text to speech engine to: 1) 
seek out a fourth reference, the Oh reference; 2) disregard Oh's extensive teaching of text to speech
processing on a character-by-character basis; 3) selectively cull from Oh only a secondary text to
speech processor; and 4) further modify the structure and operation of the already substantially
modified Sharman/Hata/DIC combination to integrate the secondary text to speech processor from
Oh.

Applicant respectfully submits that there is no teaching or motivation in the cited references 
to suggest such highly selective and highly speculative modification to one of ordinary skill in the
art, or to provide a reasonable expectation of successfully completing such modification.

Furthermore, such selective and substantial modifications still fail to teach or suggest all the
limitations of Claims 1, 2, 6 and, subsequently, 23.

Applicant respectfully submits that Claim 23 overcomes the rejection based upon a highly
speculative and selective combination of the Sharman, Hata, DIC and Oh references. Claim 23 is
allowable. Applicant respectfully requests reconsideration and allowance of Claim 23.

Claim 7

Claim 7 depends from allowable Claim 6, and provides further limitations not taught or 
suggested by Sharman, Hata or DIC.

Claim 7 requires marking of a parsed textual unit as being out of vocabulary, and adding the
marked textual unit to the list.

The Examiner has previously conceded that "Sharman in view of Hata and DIC does not 
expressly disclose to mark a text unit that does not match the one either in dictionary or by rules 
sets." 

Applicant agrees. 

Nevertheless, the Examiner then contends that "it would have been obvious to one of 
ordinary skill in the art at the time the invention was made to modify Sharman by specifically 
marking a text unit of the processed data, as taught by Rl, for the purpose of distinguishing the text 
unit that is not in the dictionary and preparing for further processing stages." 

Despite this highly speculative assertion, Applicant finds no suggestion or motivation in
Sharman, Hata or DIC for one of ordinary skill in the art to: 1) read the Sharman reference; 2)
selectively and substantially modify Sharman by Hata; 3) selectively supplement the Sharman/Hata
modification with a third reference, DIC; 4) spontaneously determine, after finding no teaching or
suggestion of marking textual units in a substantial and highly selective combination of three
references, that marking textual units is desirable or necessary; 5) find a fourth reference, Rl; 6)
selectively interpret the Rl reference to suggest marking of textual units in the Sharman/Hata/DIC
combination; and 7) selectively modify the Sharman/Hata/DIC combination to add structure and
operations for marking textual units.

Applicant respectfully submits that there is no teaching or motivation in either Sharman or 
Hata or DIC to suggest such selective and substantial modification to one of ordinary skill in the art,
or to provide a reasonable expectation of successfully completing such selective and substantial 
modification. 

Furthermore, such selective and substantial modification still fails to teach or suggest all the 
limitations of Claim 7.

Applicant respectfully submits that Claim 7 overcomes the rejection based upon a highly
speculative and selective combination of four references. Claim 7 is allowable. Applicant
respectfully requests reconsideration and allowance of Claim 7.

Claim 8

Claim 8 depends from allowable Claim 7, and provides further limitations distinguishing
over the cited references.

Applicant respectfully traverses the Examiner's interpretation of the cited references, as well
as the selective and speculative combinations thereof. 

Claim 8 is allowable. Applicant respectfully requests reconsideration and allowance of 
Claim 8.

Claim 14

Applicant respectfully traverses the Examiner's suggested interpretations of the cited
references. 

Applicant again respectfully submits that Sharman discloses a text to speech system for 
converting input text into an output acoustic signal simulating natural speech. (Col. 1, lines 65-67).
Sharman discloses breaking down words into constituent syllables. A dictionary may be used for
the purpose of determining syllabic breaks. (Col. 5, lines 22-25). Constituent syllables are then
further broken down into constituent phonemes. A dictionary look-up table - or some general
purpose rules - may be used for this purpose. (Col. 5, lines 22-25). Sharman discloses removal and
disregarding of any possible prefix or suffix, so as to enable the disaggregation of an underlying
word into phonemes. (Col. 5, lines 26-29).

Sharman discloses annotation of phonemes with certain characteristics (e.g., pitch, duration).
(Col. 5, lines 22-25). Steps are then performed to assemble phonemes into "breath groups." (Col.
5, line 48 - Col. 6, line 16). After the constituent phonemes and their characteristics have been
determined, the acoustic processor determines diphones from the constituent phonemes. Each
diphone represents a transition between two phonemes. A diphone library (with prerecorded sounds
of the diphones) is accessed to retrieve the corresponding diphone samples, which are then
concatenated to produce the output signal. See, generally, Col. 5, lines 18-40; Col. 6, lines 22-38.

Therefore, Sharman parses the text file down to constituent phonemes. A group of phonemes 
are matched to diphones from a diphone library. Sound units are then grouped and generated using
diphones. Sharman suggests that its approach is desirable to avoid excess processing. (Col. 2, lines
22-33). 

The entire Sharman reference teaches away from processing whole words. 

Thus far, the Examiner has conceded that: "Sharman fails to explicitly disclose utilizing
'speech sample' for the speech in the diphone library for the phonetic data"; "Sharman does not
expressly disclose 'a data structure' including several fields"; "Sharman does not expressly disclose
'each speech sample corresponding to a one of said words, . . . in said vocabulary'"; "Sharman does
not expressly disclose 'each speech sample corresponding to a one of said, . . . prefixes and suffixes
in said vocabulary'"; "Sharman does not expressly disclose 'wherein said vocabulary of textural [sic]
unit comprise words each having a pre-recorded speech sample associated therewith'"; and
"Sharman does not expressly disclose 'wherein said vocabulary of textural [sic] unit comprises . . .
prefixes and suffixes each having a pre-recorded speech sample associated therewith.'"

Applicant agrees with these concessions.

By comparison, Hata discloses a high-quality concatenative reading system for converting
an input string into a sequence for subsequent audible synthesis. (Col. 1, lines 64-66). Hata
discloses a dictionary of words and a word list generator coupled to the dictionary. The word list
generator receives the input string and builds a word list from words stored in the dictionary
corresponding to the input string. The word list generator assigns one or more prosodic environment
tokens to word list entries - preferably to each entry in the word list. A reading system analyzes the
word list to determine phonological features on the entries. Based on the word list, the environment
token(s), and the phonological features, the reading system selects speech samples to be concatenated
to supply the signal for audible synthesis. (Col. 1, line 66 - Col. 2, line 41). 

Hata discloses a dictionary of digitally sampled sounds that have been recorded and stored 
in advance. (Col. 3, lines 42-44). The dictionary includes different samples for each possible pitch 
contour (i.e., prosodic environment) for each word in the dictionary. (Col. 4, lines 28-31). In
addition to storing an entry for each prosodic environment of each word, the dictionary may also
store all pronunciation variants of each word for each prosodic environment. (Col. 4, lines 37-55).
Hata's dictionary thus comprises multiple variations of each word entry. 

Hata discloses input of text to be converted to speech. (Col. 4, lines 58-63). Hata discloses 
a word list generator including a prosodic environment table that identifies the possible prosodic
states of the text to speech implementation. (Col. 4, lines 63-65). The word list generator builds a
word list comprising a word token, and a prosodic token for each word token, arranged in the order
that they will be pronounced in the output speech. (Col. 5, lines 10-15). A reading system includes
a phonological feature analyzer. For each word in the word list, the phonological feature analyzer 
evaluates the preceding and following words. Based on this information, the phonological feature 
analyzer selects a pre-recorded sample from the library having the corresponding prosodic and 
phonological characteristics. This is then added to a sample list, which is eventually output as the 
speech signal. (Col. 5, lines 16-31). 

Regarding Claim 14, the Examiner postulates that "it would have been obvious to one of 
ordinary skill in the art at the time the invention was made to modify Sharman for specifically 
providing a stored speech sample corresponding to a word or other unit, as taught by Hata, for the 
purpose of selecting the appropriated [sic] 'granularity' or dictionary entry size to suit the specific
application." 

Applicant respectfully disagrees. 

Again, the entire Sharman reference teaches away from processing whole words.

In order to combine the Sharman and Hata references as the Examiner has suggested -
without the benefit of hindsight - one of ordinary skill in the art would have to: 1) find and read the
Sharman reference; 2) understand Sharman's phoneme and diphone-based structures and operations;
3) spontaneously decide - in spite of Sharman's teaching away from processing whole words - that,
out of all of Sharman's system, only its elemental and basic stored diphones needed to be improved
to include larger stored speech samples in word or larger units; 4) seek out and find Hata's system;
5) disregard all of Hata's structures and methods, except for its library of pre-recorded samples for
a given word and each tonal variation thereof; 6) selectively cull from Hata only its extensive library
of pre-recorded samples for a given word and each tonal variation thereof; 7) disregard Sharman's
expressed intent to avoid excess processing; 8) augment Sharman's diphone library with Hata's
extensive library of pre-recorded samples for a given word and each tonal variation thereof; 9)
selectively discard Sharman's extensive teaching of, and structure and operations for, breaking words
down into their constituent syllables; 10) selectively discard Sharman's structure and operations for
phoneme duration assignment; 11) selectively discard Sharman's structure and operations for breath
group assembly; and 12) substantially modify and supplement Sharman's syllabic, phoneme and
diphone-based structures and operations to successfully process and output Hata's pre-recorded word
samples.

Applicant respectfully submits that there is no teaching or motivation in either Sharman or 
Hata to suggest such selective and substantial modification to one of ordinary skill in the art, or to
provide a reasonable expectation of successfully completing such selective and substantial 
modification. 

The Examiner then goes on to speculate that "it would have been obvious to one of ordinary 
skill in the art at the time the invention was made to modify Sharman in view of Hata for specifically 
providing a mechanism to treat a prefix or suffix as same way [sic] as a word in a dictionary, as 
taught by DIC, so that each stored speech sample can correspond to one of word, prefix and suffix, 
for the purpose of selecting the appropriated [sic] 'granularity' or dictionary entry size to suit the
specific application."

Applicant respectfully disagrees.

In order to further modify the already substantially modified Sharman/Hata combination as
the Examiner has suggested - without the benefit of hindsight - one of ordinary skill in the art would
have to: 1) spontaneously decide that - to the extent that either Sharman or Hata disclose or suggest
treatment of prefixes or suffixes - both references were somehow deficient or incomplete with regard
thereto; 2) seek out and find the DIC reference; 3) arbitrarily decide, based upon a dictionary entry
for either a prefix or suffix, that prefixes and suffixes should be treated as words; and 4) further
modify the structures and operations of the already substantially modified Sharman/Hata
combination to handle prefixes and suffixes as words.

Again, Applicant respectfully submits that there is no teaching or motivation in the cited 
references to suggest such highly selective and highly speculative modification to one of ordinary 
skill in the art, or to provide a reasonable expectation of successfully completing such modification. 

The Examiner then concedes that "Sharman in view of Hata and DIC does not expressly 
disclose the data structure having 'a field for a frequency of a first portion of the speech sample that 
exceeds an amplitude threshold, and a field for a frequency of a last portion of the speech sample that
exceeds an amplitude threshold.'"

Nevertheless, the Examiner goes on to contend that "it would have been obvious to one of 
ordinary skill in the art at the time the invention was made to modify Sharman in view of Hata and 
DIC by specifically providing data structures having multiple fields for frequency or time (duration)
information for processing and storing speech data, as taught by Malsheen."

Despite this highly speculative assertion, Applicant finds no suggestion or motivation in
Sharman, Hata or DIC for one of ordinary skill in the art to: 1) read the Sharman reference; 2)
selectively and substantially modify Sharman by Hata; 3) selectively supplement the Sharman/Hata
modification with a third reference, DIC; 4) spontaneously determine, after finding no teaching or
suggestion of fields for frequency of first or last portions of speech samples that exceed amplitude
thresholds in a substantial and highly selective combination of three references, that "providing data
structures having multiple fields for frequency or time (duration) information for processing and
storing speech data" is desirable or necessary; 5) find a fourth reference, Malsheen; 6) selectively
cull from Malsheen only the idea of "providing data structures having multiple fields for frequency
or time (duration) information for processing and storing speech data" while disregarding the
remainder of Malsheen's structures and methods utilizing consonant and vowel allophones and
parameters related thereto; and 7) selectively modify the Sharman/Hata/DIC combination to add
structure and operations for "data structures having multiple fields for frequency or time (duration)
information for processing and storing speech data".

Applicant respectfully submits that there is no teaching or motivation in either Sharman or 
Hata or DIC to suggest such selective and substantial modification to one of ordinary skill in the art,
or to provide a reasonable expectation of successfully completing such selective and substantial
modification. 

Furthermore, such selective and substantial modification still fails to teach or suggest all the
limitations of Claim 14. 

Applicant respectfully submits that Claim 14 overcomes the rejection based upon a highly
speculative and selective combination of the Sharman, Hata, DIC and Malsheen references. Claim 14
is allowable. Applicant respectfully requests reconsideration and allowance of Claim 14.

Claims 15, 19 and 20

Claims 15, 19 and 20 depend from allowable Claim 14, and provide further limitations 
distinguishing over the cited references. 

Applicant respectfully traverses the Examiner's interpretation of the cited references, as well 
as the selective hindsight combinations thereof.

Claims 15, 19 and 20 are allowable. Applicant respectfully requests reconsideration and
allowance of Claims 15, 19 and 20.

Claim 16

Claim 16 depends from allowable Claim 7, and provides further limitations distinguishing 
over the cited references. 

Applicant respectfully traverses the Examiner's interpretation of the cited references, as well 
as the selective hindsight combination of five references. 

Claim 16 is allowable. Applicant respectfully requests reconsideration and allowance of 
Claim 16. 

Claim 18

Claim 18 depends from allowable Claim 12, and provides further limitations distinguishing 
over the cited references. 

Applicant respectfully traverses the Examiner's interpretation of the cited references, as well
as the selective and speculative combination of five references. 

Claim 18 is allowable. Applicant respectfully requests reconsideration and allowance of 
Claim 18. 

Claim 17

Claim 17 depends from allowable Claim 8, and provides further limitations distinguishing 
over the cited references. 

Applicant respectfully traverses the Examiner's interpretation of the cited references, as well 
as the selective and speculative combination of six references. 

Claim 17 is allowable. Applicant respectfully requests reconsideration and allowance of 
Claim 17. 

Accordingly, the Applicant respectfully requests withdrawal of the § 103(a) rejections of
Claims 1-12 and 14-23.

IV. CONCLUSION

As a result of the foregoing, the Applicant asserts that the remaining Claims in the
Application are in condition for allowance, and respectfully requests an early allowance of such
Claims.

If any issues arise, or if the Examiner has any suggestions for expediting allowance of this
Application, the Applicant respectfully invites the Examiner to contact the undersigned at the
telephone number indicated below or at rmccutcheon@davismunck.com.

The Commissioner is hereby authorized to charge any additional fees connected with this 
communication or credit any overpayment to Davis Munck Deposit Account No. 50-0208. 



P.O. Drawer 800889 

Dallas, Texas 75380 

(972) 628-3632 (direct dial) 

(972) 628-3600 (main number) 

(972) 628-3616 (fax)

E-mail: rmccutcheon@davismunck.com



Respectfully submitted, 



Davis MUNCK, P.C. 





Robert D. McCutcheon 
Registration No. 38,717